9 Probability and Likelihood

The reader can readily verify that a plot of $b$ versus $k$ is a hump whose central term occurs at $m = [(n + 1)p]$, where the notation $[x]$ signifies “the largest integer not exceeding $x$”.
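The verification can be sketched numerically. The following is a minimal illustration, not part of the original text: the helper names `binom_pmf` and `central_term` and the values $n = 20$, $p = 0.3$ are assumptions chosen for the example.

```python
from math import comb, floor

def binom_pmf(k: int, n: int, p: float) -> float:
    """b(k; n, p): probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def central_term(n: int, p: float) -> int:
    """m = [(n + 1)p], the largest integer not exceeding (n + 1)p."""
    return floor((n + 1) * p)

# Illustrative values (assumed, not taken from the text).
n, p = 20, 0.3
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Index of the largest term of the hump, found by brute force.
mode = max(range(n + 1), key=pmf.__getitem__)
print(mode, central_term(n, p))
```

When $(n + 1)p$ happens to be an integer, the terms at $m - 1$ and $m$ are equal, so a brute-force search may report either one.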

An important practical case arises where $n$ is large and $p$ is small, such that the product $np = \lambda$ is of moderate size ($\sim 1$). The distribution can then be simplified:

$$b(k; n, p) = \binom{n}{k} \left(\frac{\lambda}{n}\right)^{k} \left(1 - \frac{\lambda}{n}\right)^{n-k} = \frac{\lambda^{k}}{k!} \left(1 - \frac{\lambda}{n}\right)^{n-k} \frac{n(n-1) \cdots (n-k+1)}{n^{k}} .$$

Now, $(1 - \lambda/n)^{n-k} \approx e^{-\lambda}$ and $n(n-1) \cdots (n-k+1)/n^{k} \approx 1$; hence,

$$b(k; n, p) \approx \frac{\lambda^{k}}{k!} e^{-\lambda} = p(k; \lambda) , \tag{9.28}$$

which is called the Poisson approximation to the binomial distribution. However, if $\lambda$ is fixed, then $\sum p(k; \lambda) = 1$; hence, $p(k; \lambda)$, the probability of exactly $k$ successes occurring, is a distribution in its own right, called the Poisson distribution. It is of great importance in nature, describing processes lacking memory.
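A short numerical sketch may help gauge how good the approximation is. It is an illustration only: the function names, the choice $\lambda = 1$, the values of $n$, and the truncation of the sums are all assumptions made here, not taken from the text.

```python
from math import comb, exp, factorial

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability b(k; n, p); zero when k > n."""
    return comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability p(k; lambda) = lambda^k e^{-lambda} / k!."""
    return lam**k / factorial(k) * exp(-lam)

lam = 1.0          # assumed moderate value of np
errors = []
for n in (10, 100, 1000):
    p = lam / n
    # Largest absolute discrepancy over the first 21 terms (k = 0..20).
    worst = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(21))
    errors.append(worst)
    print(n, worst)

# Partial sum of the Poisson probabilities; the neglected tail is tiny.
total = sum(poisson_pmf(k, lam) for k in range(50))
print(total)
```

The discrepancy shrinks as $n$ grows with $np$ held fixed, and the partial sum of $p(k; \lambda)$ approaches 1, consistent with the statement that the Poisson probabilities form a distribution in their own right.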

The probability $f(k; r, p)$ that exactly $k$ failures precede the $r$th success (i.e., exactly $k$ failures among $r + k - 1$ trials followed by success) is

$$f(k; r, p) = \binom{r + k - 1}{k} p^{r} q^{k} = \binom{-r}{k} p^{r} (-q)^{k} , \quad k = 0, 1, 2, \ldots . \tag{9.29}$$
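The second equality in (9.29) relies on a binomial coefficient with a negative upper index. A brief derivation follows, assuming the usual generalized definition $\binom{a}{k} = a(a-1) \cdots (a-k+1)/k!$, which the text does not restate:

```latex
\binom{-r}{k}
  = \frac{(-r)(-r-1) \cdots (-r-k+1)}{k!}
  = (-1)^{k} \frac{r(r+1) \cdots (r+k-1)}{k!}
  = (-1)^{k} \binom{r+k-1}{k} ,
\qquad\text{so}\qquad
\binom{-r}{k} (-q)^{k} = \binom{r+k-1}{k} q^{k} .
```

For instance, with $r = 3$ and $k = 2$, both $\binom{-3}{2} = (-3)(-4)/2$ and $\binom{4}{2}$ equal 6.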

Iff$^{9}$

$$\sum_{k=0}^{\infty} f(k; r, p) = 1 , \tag{9.30}$$

the possibility that an infinite sequence of trials produces fewer than $r$ successes can be discounted, since by the binomial theorem

$$\sum_{k=0}^{\infty} \binom{-r}{k} (-q)^{k} = p^{-r} , \tag{9.31}$$

which equals 1 when multiplied by $p^{r}$. The sequence $f(k; r, p)$ is called the negative binomial distribution.
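The normalization can also be checked numerically. The sketch below is illustrative only: the helper names `neg_binom_pmf` and `gen_binom`, the values $r = 3$, $p = 0.4$, and the truncation at 200 terms are assumptions made for this example.

```python
from math import comb, factorial

def neg_binom_pmf(k: int, r: int, p: float) -> float:
    """f(k; r, p): probability that exactly k failures precede the r-th success."""
    q = 1.0 - p
    return comb(r + k - 1, k) * p**r * q**k

def gen_binom(a: int, k: int) -> int:
    """Generalized binomial coefficient a(a-1)...(a-k+1)/k! for integer a, possibly negative."""
    num = 1
    for j in range(k):
        num *= a - j
    # The product of k consecutive integers is always divisible by k!.
    return num // factorial(k)

# Illustrative parameters (assumed, not taken from the text).
r, p = 3, 0.4
q = 1.0 - p

# Truncated versions of the infinite sums; q**200 is negligible here.
total = sum(neg_binom_pmf(k, r, p) for k in range(200))
series = sum(gen_binom(-r, k) * (-q)**k for k in range(200))
print(total, series, p**-r)
```

The first sum approaches 1 and the second approaches $p^{-r}$, in line with (9.30) and (9.31).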

Example. Suppose that the normal rate of infection of a certain disease in cattle is 25%.$^{10}$ An experimental vaccine is injected into $n$ animals. If it is wholly ineffectual, the probability that exactly $k$ animals remain free from infection is $b(k; n, 0.75)$; for $k = n = 10$, this probability is approximately 0.056; the probability that at most 1 animal out of 17 becomes infected is slightly lower, approximately 0.050, and for 2 out of

9 If and only if.

10 Due to P. V. Sukhatme and V. G. Panse, quoted by Feller (1967), Chap. 6.